SITE-SPECIFIC DOUBLE STRANDED DNA ENDONUCLEASE

This invention relates generally to a protein, more
specifically to an endonuclease protein which has not
previously been purified or characterized from its
natural biological source. This endonuclease is novel and
extremely useful because it cleaves double-stranded DNA
at specific, infrequent sites for which endonucleases
were not previously available. The resulting fragments
are of great value for human gene mapping because the
cleavage site is a sequence ordinarily encountered in
genomic DNA, and because cleavage by the endonuclease
produces relatively larger fragments than those
produced by many previously available endonucleases.
This invention also includes methods for purifying the
endonuclease and for cleaving DNA by use of the
endonuclease.

(A) Restriction Endonucleases

One of the essential tools molecular biologists use
to delve deeper into the mysteries of life contained in
the structure of DNA, the genetic material, is a
molecular scissors called a restriction endonuclease.
There are many such enzymes which are capable of cutting
DNA at specific sites (see Lewin, 1987 for review).

Restriction enzymes (restriction endonucleases)
recognize specific short sequences of DNA (usually
unmethylated DNA) and cleave the duplex molecule, usually
at the target recognition site, but sometimes elsewhere.
In some instances, the recognition site is specific, but
the cleavage site is located some distance away from the
recognition site and does not appear to be at any
specific sequence.

"Duplex" refers to the double stranded composition
of the DNA molecule. The cleavage induced by
endonucleases is usually at specific sequences of
approximately 4-6 base pairs. A base pair is a union of
a purine with a pyrimidine in the DNA duplex. There are
four such bases and they pair in specific unions: adenine
with thymine (A-T), and guanine with cytosine (G-C).
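
The pairing rule just stated can be sketched in a few lines of Python. This is an illustrative sketch only; the names `PAIRS`, `complement` and `reverse_complement` are hypothetical and are not part of the disclosure.

```python
# Base-pairing rule from the text: adenine pairs with thymine,
# guanine pairs with cytosine.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of one DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("GAATTC"))          # CTTAAG
    print(reverse_complement("GAATTC"))  # GAATTC (the site is palindromic)
```

A sequence that equals its own reverse complement, such as GAATTC, reads the same on both strands of the duplex.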

Fragments generated by endonucleases are amenable
to further analysis of their nucleotide composition.
Variation among individuals in the fragment sizes
obtained from the same chromosomal locations is referred
to as restriction fragment length polymorphism (RFLP).

Restriction endonucleases are essential components
of methods used to construct maps of the genetic
material, although not all such endonucleases are useful.
One problem limiting their use is that cleavage
may be too frequent using a particular enzyme, producing
pieces too small to be useful. Another problem is that
the sites attacked may have nucleotide sequences that are
so unusual that they are not likely to occur in vivo.
Some enzymes only cleave artificially engineered
sequences.
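
How often an enzyme is expected to cut can be estimated with a simple calculation. The sketch below assumes, as a simplification that the text does not state, that the four bases occur independently and with equal frequency; under that assumption a recognition sequence of n base pairs is expected about once every 4^n base pairs. The function names are illustrative.

```python
def expected_spacing(site_length: int) -> int:
    """Expected distance (bp) between occurrences of a recognition
    sequence of the given length, assuming independent and equally
    frequent bases (a simplifying assumption)."""
    return 4 ** site_length


def expected_sites(genome_bp: int, site_length: int) -> float:
    """Expected number of occurrences in a genome of the given size."""
    return genome_bp / expected_spacing(site_length)


if __name__ == "__main__":
    for n in (4, 6, 8):
        print(f"{n} bp site: about one cut every {expected_spacing(n)} bp")
```

Under this model a 4 bp site recurs every 256 bp and a 6 bp site every 4,096 bp, which is why enzymes with longer or rarer recognition sequences yield larger fragments.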

Restriction endonucleases are named by using three
or four letter abbreviations identifying their origin,
coupled with a letter and/or number designation which
distinguishes multiple enzymes of the same origin. An
example of the nomenclature is EcoRI, one of the
endonucleases derived from E. coli. Most of the
endonucleases discovered initially were isolated from
bacteria, in which they cleave DNA as part of the natural
function of the cell. However, other organisms, for
example, yeast, can be used as a source of double-strand
DNA cleaving endonucleases.

Isolation of many endonucleases occurred because the
bacteria from which the endonucleases were derived were
able to distinguish between the DNA native to the
bacteria and any invading foreign DNA. One of the ways
bacteria recognize foreign DNA is by the absence of
methyl groups at appropriate base pair sites. A
bacterium protects its own DNA from cleavage by its own
endonucleases by methylation of its own DNA bases at
appropriate target sites. Successful attack on bacteria
by foreign DNA, for example by viruses, may be due either
to the fact that the virus DNA has the same methylation
pattern as the host DNA, or alternatively, that mutations
have caused defects in the ability of the bacteria to
produce an endonuclease or to attack the foreign DNA.
Endonucleases isolated from bacteria are of two types,
one which is only able to cleave DNA, and another in
which both restriction and methylation activities are
combined. Some restriction endonucleases introduce
staggered cuts with overhangs while others generate blunt
ends.

(B) Restriction Mapping

Gene maps give the location of specific genes
(specific DNA nucleotide sequences) that encode the
primary sequences of protein gene products relative to
each other and also localize the genes on specific
chromosomes of higher organisms. A map of DNA obtained
by using endonucleases to map breakpoints is called a
restriction site map and consists of a linear sequence of
cleavage sites. This physical map is obtained by
extracting DNA from the chromosomes in cells, breaking
the extracted DNA at various points with endonucleases,
and determining the order of cleavage sites by analysis
of the fragments.

Distances along the maps are measured directly in
base pairs or, if distances are long, in megabase pairs.
By comparing the sequences of DNA between relatively
short distances, a DNA map is constructed in a stepwise
fashion. A major goal of current research is to
construct a map of the entire human genome (The Human
Genome Project, American Society of Human Genetics
Symposium, Baltimore, Nov. 15, 1989). Success in mapping
human and animal genomes will require a selection of
endonucleases which cleave at a large variety of sites
which occur in the DNA of living organisms, not just in
artificial sequences.

DNA fragments produced by the action of
endonucleases are separated on the basis of size by
agarose or polyacrylamide gel electrophoresis. An
electric current is passed through the gel, causing the
fragments to move down it at a rate depending on length;
the smaller fragments move more rapidly. The result of
this migration in a gel is a series of bands, each
corresponding to a fragment of a particular size. Many
different endonucleases are used for gene mapping, and
large numbers of overlapping fragments are analyzed.
Sequential cleavage using different endonucleases
produces a series of larger fragments broken down into
smaller fragments. A hierarchy is then constructed based
on the fact that there is complete additivity of length
of the fragments within the original starting fragment.
For example, a fragment of 2,100 base pairs may be broken
down into fragments of 200 and 1,900 base pairs (see
Lewin, 1987 for review).
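
The additivity of fragment lengths can be checked with a small sketch. The helper below is hypothetical and treats the starting fragment as a linear molecule with cut positions measured in base pairs from its left end.

```python
def fragment_lengths(total_bp: int, cut_positions: list[int]) -> list[int]:
    """Lengths of the pieces obtained by cutting a linear molecule of
    total_bp base pairs at the given positions (bp from the left end)."""
    if any(not 0 < p < total_bp for p in cut_positions):
        raise ValueError("cut positions must lie strictly inside the molecule")
    bounds = [0] + sorted(cut_positions) + [total_bp]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    first = fragment_lengths(2100, [200])
    print(first, sum(first))    # [200, 1900] 2100
    # A second enzyme that also cuts at position 1200 subdivides the
    # 1,900 bp piece; the lengths still add up to the starting fragment.
    second = fragment_lengths(2100, [200, 1200])
    print(second, sum(second))  # [200, 1000, 900] 2100
```

Because the pieces always sum to the length of the starting fragment, the products of sequential digests can be ordered into a hierarchy.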

Construction of an entire gene map for a species,
for example construction of the human gene map, is a
difficult and tedious task. The larger the number of
endonucleases available for restriction mapping, the
easier and more sophisticated the genetic map
construction. In particular, many endonucleases are
needed which cleave at a variety of specific sites and
which produce fragments of different lengths. To
appreciate the magnitude of the mapping problem it should
be noted that the human genome comprises an estimated
3 billion base pairs contained in 22 pairs of
chromosomes called autosomes plus two sex chromosomes.

Restriction maps have advantages over older
methods of mapping, which identified a series of genetic
sites through the occurrence of DNA changes
(mutations), because restriction maps can be obtained for
any sequence of DNA. Their construction is not dependent
upon the location of mutations, and no knowledge of the
function of a particular sequence of DNA is required.
However, restriction maps are related to, and are
generally colinear with, "genetic" maps.

Mutations which are deletions or insertions of base
pairs may be detected in restriction maps by noting an
alteration of the length of a restriction fragment in
which the mutation lies. Base-pair change mutations
may be detected if their presence inactivates
or creates a cleavage site of a particular endonuclease,
altering the length of the restriction fragments produced
by cleavage in the area of DNA in which the mutation
lies.
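
Both effects can be illustrated on a toy sequence. The sketch below is hypothetical: the sequences are invented for illustration, and the site used is the well-known EcoRI recognition sequence GAATTC, cut after the first G.

```python
def find_sites(seq: str, site: str) -> list[int]:
    """Start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths after cutting a linear sequence at cut_offset
    bases into each occurrence of site."""
    cuts = [p + cut_offset for p in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


SITE = "GAATTC"  # EcoRI recognition sequence, cut after the first G
normal = "AAAA" + SITE + "CCCCCCCC" + SITE + "TTTT"    # invented example
point_mutant = normal[:6] + "G" + normal[7:]           # A->G destroys site 1
insertion_mutant = normal[:12] + "TTT" + normal[12:]   # 3 bp insertion

if __name__ == "__main__":
    print(digest(normal, SITE, 1))            # [5, 14, 9]
    print(digest(point_mutant, SITE, 1))      # [19, 9]
    print(digest(insertion_mutant, SITE, 1))  # [5, 17, 9]
```

Loss of a site merges two neighbouring fragments into one, while an insertion between two sites lengthens only the fragment that contains it.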

(C) Restriction Fragment Length Polymorphisms (RFLP)

Different alleles (conditions of a gene) may lead to
the production of different proteins and subsequent
variation in the phenotype (the detectable physical,
biochemical, or physiologic makeup of the organism).
Variation of DNA within populations is called genetic
polymorphism. Even if the polymorphism does not lead to
detectable changes in the phenotype by physical
appearance or biochemical assays, genetic polymorphism
may be detectable by variations in the DNA restriction
fragment lengths (RFLP). Polymorphic variation in the
restriction map therefore is independent of gene
function.

RFLPs have numerous applications, including as
markers for paternity testing or for determining the
location of specific genes. For example, mutant genes
responsible for inherited diseases such as Huntington's
Chorea, a progressive neurological degeneration, have been
localized to specific chromosomes in humans by
correlating inheritance of RFLPs in families with the
inheritance of the particular clinical condition. RFLP
patterns of family members who are normal are compared
with patterns of family members who are affected with a
particular genetic disease.

(D) Other Uses for Restriction Endonucleases

Another use of restriction endonucleases is to
create and use cloning vectors for the transmission of
DNA sequences. For this purpose, the gene of interest
needs to be attached to the vector fragment. One way
this may be accomplished is by generating complementary
DNA sequences on the vector and on the gene of interest
so that they can be united (recombined). Some
restriction endonucleases make staggered cuts which
generate short, complementary, single stranded "sticky
ends" of the DNA. An example of such an action is that
effected by the EcoRI endonuclease, which cleaves each of
the two strands of duplex DNA at a different point.
These cleavage sites lie on either side of a short
sequence that is part of the site recognized by the
endonuclease. When two different DNA molecules are
cleaved with EcoRI, the same sticky ends are generated,
which enables them to combine with each other. The DNA
fragment can then be retrieved by cleaving the vector
with EcoRI to release the gene.
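
The staggered cut can be made concrete with a short sketch. It relies on the well-known facts that EcoRI recognizes GAATTC and cuts each strand between G and A, and that SmaI cuts CCCGGG in the middle; the function names and the coordinate convention are illustrative assumptions.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Partner strand written in its own 5'-to-3' direction."""
    return "".join(PAIRS[base] for base in reversed(strand))


def overhang(site: str, top_cut: int) -> tuple[str, str]:
    """For a palindromic site cut at the same offset on each strand
    (each read 5' to 3'), return the single-stranded overhang and
    the kind of end produced."""
    bottom_cut = len(site) - top_cut  # bottom-strand cut, top-strand coordinates
    low, high = sorted((top_cut, bottom_cut))
    if low == high:
        return "", "blunt"
    kind = "5' overhang" if top_cut < bottom_cut else "3' overhang"
    return site[low:high], kind


if __name__ == "__main__":
    tail, kind = overhang("GAATTC", 1)       # EcoRI
    print(tail, kind)                        # AATT 5' overhang
    print(reverse_complement(tail) == tail)  # True: ends are self-complementary
    print(overhang("CCCGGG", 3))             # SmaI: ('', 'blunt')
```

Because the AATT tail is its own reverse complement, any two ends produced by EcoRI can pair with each other, whichever molecule they came from.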

(E) Exons and Introns 

The restriction map of DNA may not correspond
directly with the coding sequence of messenger RNA
produced by the DNA because DNA sequences of the total
gene may consist of exons and introns. Exons are that
part of the DNA code that appears in the messenger RNA.
Most, but not all, exons code for proteins. Introns are
DNA sequences that are usually spliced out of the RNA
product before the messenger RNA proceeds to be
translated into proteins. Splicing consists of a
deletion of the intron from the primary RNA transcript
and a joining or fusion of the ends of the remaining RNA
on either side of the excised intron. Presence or
absence of introns, the composition of introns, and
number of introns per gene may vary among strains of the
same species, and among species having the same basic
functional gene. Although in most cases introns are
assumed to be nonessential and benign, their
categorization is not absolute. For example, an intron
of one gene can represent an exon of another. A mosaic
gene is defined as one which is expressed through the
splicing together of exons carried by one molecule of
RNA. In some cases, alternate or different patterns of
splicing can generate different proteins from the same
single stretch of DNA (Lewin, 1987).



(F) Mitochondrial DNA: Yeast Mitochondria

The DNA contained in mitochondria (cell organelles
which contain extranuclear DNA) represents an example of
differences arising during evolution in the
composition of genes with regard to the exon and intron
sequences as well as non-coding sequences. For example,
comparing the mitochondrial genes of yeast with those in
mammalian systems indicates that identical mitochondrial
proteins are produced despite the disparity in evolution
between these species. The yeast mitochondrial genomes,
however, are much larger than those occurring in
vertebrates due to the absence of introns in the latter
and the presence of noncoding spacer DNA in the former.

Primary DNA sequence data are known for many yeast
isolates (see deZamaroczy and Bernardi, 1986) in which
interstrain differences are due to (i) a small number of
large deletions/additions, mainly concerning introns;
(ii) a large number of small (10-150 bp)
deletions/additions located in the intergenic sequences;
(iii) 1-3 bp deletions/additions and point mutations. In
Saccharomyces cerevisiae the size of mtDNA can range up
to about 84 kilobases, approximately 2/3 of which is
non-coding regions. There are more than 20 mitochondria
per cell, with approximately 4 genomes per
mitochondrion. In comparison, vertebrate mitochondrial
DNA is approximately 17 kilobase pairs. In the
individual mitochondrion there are usually several copies
of a single molecule of DNA. Moreover, there are very
many mitochondria per cell. In plants and some
unicellular eukaryotes, extranuclear DNA is also found in
chloroplasts.



The DNA within the mitochondria directs protein
synthesis, just as nuclear DNA does. However, a finite
number of proteins are produced. There are general
similarities in the machinery for gene expression in
mitochondria of various species, rendering products and
information derived in one genus applicable to many
others. For some products, cytochrome c oxidase in
yeast, for example, protein synthesis combines factors
produced in the cytoplasm and encoded by nuclear DNA with
those synthesized directly in the mitochondria.
Mutations identifying almost all the mitochondrial genes
have been detected. Nuclear mutations that interact
with, or abolish, the effects of these mutational
complexes in the mitochondria have also been found.
Genes coding for many of the same functions are present
in both the yeast and the mammalian mitochondrial genomes,
making the yeast mitochondria a good model system for
testing theories on gene expression in higher organisms.
The mitochondrial genome of one yeast species,
Saccharomyces cerevisiae, has provided both information on
genetic expression and products which can be
useful for analysis of higher systems (Lewin, 1987;
Butow, 1985).

In the invention described herein, a new and unique
restriction endonuclease has been isolated and purified.
One obstacle to purification and characterization of this
enzyme in the past has been the inability to accumulate
sufficient amounts of the protein, a problem which has
been solved by methods disclosed in this invention. A
preferred source of the endonuclease described in this
invention is yeast mitochondria from a special strain.
The endonuclease has wide applications for in vivo or in
vitro cleavage of double-stranded DNA from many genera.

In certain aspects, this invention is directed
towards a substantially purified endonuclease preparation
having an apparent molecular weight of 31,000 daltons as
determined by SDS polyacrylamide gel electrophoresis.
This endonuclease is capable of cleaving double-stranded
DNA at specific sites (Figure 1). Within these sites,
certain specific nucleotides are essential for cleavage,
as indicated in Figure 1; at other positions, base
substitution is compatible with cleavage.

The endonuclease described in this invention is
further defined as having a biological activity of up to
about 100 units/mg of protein in the crude extract. One
unit of endonuclease activity is defined as that amount
of enzyme that catalyzes the cleavage of 50 ng of a DNA
molecule in one hour at 30°C, although other definitions
of activity would be within the scope of this invention.

Purification of the endonuclease by standard protein
purification techniques was accompanied at each
sequential step by increased specific activity of the
endonuclease. Specific examples comprise a biological
activity up to about 34,000 units/mg after
phosphocellulose chromatography; 50,000 units/mg after
Affigel Heparin chromatography; 200,000 units/mg after gel
filtration; and 500,000 units/mg after DNA affinity
chromatography (Figures 3-6, Table I).
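
The enrichment at each step can be expressed as fold purification relative to the crude extract. The sketch below uses the approximate specific activities quoted in the text, taking the gel filtration figure as units/mg, and applies the unit definition given above (50 ng of DNA cleaved in one hour). The names and data layout are illustrative.

```python
# Approximate specific activities (units/mg protein) quoted in the text;
# each is an "up to about" value, and the gel filtration figure is
# assumed to be in units/mg like the others.
STEPS = [
    ("crude mitochondrial extract", 100),
    ("phosphocellulose", 34_000),
    ("Affigel Heparin", 50_000),
    ("gel filtration", 200_000),
    ("DNA affinity", 500_000),
]


def fold_purification(steps):
    """Specific activity of each step divided by that of the first step."""
    baseline = steps[0][1]
    return [(name, activity / baseline) for name, activity in steps]


def units_of_activity(ng_dna_cleaved: float, hours: float = 1.0) -> float:
    """Units, where one unit cleaves 50 ng of DNA in one hour."""
    return ng_dna_cleaved / 50.0 / hours


if __name__ == "__main__":
    for name, fold in fold_purification(STEPS):
        print(f"{name}: {fold:g}-fold")
```

On these figures the final DNA affinity fraction is enriched about 5,000-fold over the crude extract.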

The endonuclease is translated from a fusion between
the upstream exons of the mitochondrial cytochrome
oxidase subunit I gene (coxI) of yeast (Figure 1) and the
open reading frame (ORF) within the 4th intron (aI4α) of
the coxI gene. The endonuclease was capable of
cleaving recipient DNA molecules near the site of yeast
mitochondrial coxI aI4α intron insertion (Figures 7-8).

The endonuclease is also capable of acting as a maturase
under certain conditions. Examples of such conditions
include the presence of a point mutation in the intron
reading frame, the mim-2 mutation (Dujardin et al.,
1982), or the presence of the nuclear NAM2 gene (Herbert
et al., 1988).

A preferred source of the endonuclease is
mitochondria, more specifically yeast mitochondria. An
object of this invention was to describe a method of
isolating and purifying the endonuclease from yeast
mitochondria, more specifically Saccharomyces cerevisiae
mitochondria. The WA12 strain of yeast containing mtDNA
derived from strain ID41-6/161/PZ27 was a preferred
source.

A method is disclosed for preparing the
endonuclease. The preferred embodiment comprises
culturing yeast that are incapable of splicing the aI4α
intron of the coxI gene, thereby accumulating sufficient
amounts of endonuclease to use for isolation,
purification, and characterization. The method further
comprises preparing mitochondrial extracts from yeast,
fractionating the extracts, and selecting a fraction or
fractions which contain the endonuclease disclosed in
this invention. The endonuclease is further identified
by its ability to cleave double stranded DNA at the
specific site:



     TTGGTCATCCAGAAGTAT

[Cleavage site diagram: the boxed essential residues and the
permitted substitutions shown above the sequence are as in
Figure 1.]



Another object of this invention is a method of
cleaving DNA. This method comprises preparing the
endonuclease described herein by the method disclosed in
this invention, and incubating the isolated and purified
endonuclease protein with DNA so as to effectuate the
endonucleolytic cleaving of the DNA.

Other objects and advantages of the invention will
become apparent upon reading the following detailed
description and upon reference to the drawings in which
aspects of the invention are illustrated as follows:

FIGURE 1: The DNA cleavage site of the aI4α-encoded
endonuclease

The cleavage site specificity of the endonuclease is
shown. Boxed residues are those believed to be essential
for cleavage. The nucleotide changes indicated above the
non-boxed nucleotides are those that have been tested
and found to permit cleavage. It is possible that other
nucleotides will be identified which can substitute at
these and other positions. The enzyme cut sites are
indicated by the vertical and horizontal lines.

FIGURE 2: Intron configurations of the coxI gene

The coxI gene of yeast mtDNA contains up to ten
introns, three of which were found recently in coxI genes
from newly studied isolates (Kotylak et al., 1985; Ralph,
1986; Ralph et al., unpublished data). The intron
configuration of this gene in strains D273-10B and S.
norbensis Y-12,656 is shown in lines 1 and 2,
respectively. The coxI gene structure of a recombinant
resulting from conversion of aI4α is shown in line 3.
All three strains shown also contain introns aI1 and aI2.
Because two of the new introns are present in strains
used in this study, a new nomenclature for coxI gene
introns was used. Previously described introns 3 and 4
are renamed aI3α and aI4α, respectively. One new intron,
located between aI3α and aI4α, is called aI3γ (because
another new intron, aI3β, is present between aI3α and
aI3γ in S. douglasii [Kotylak et al., 1985]). Exon
sequences are indicated by dark boxes whereas introns are
designated by open or shaded boxes. Arrows indicate
regions sequenced as a basis for this invention. The
asterisk denotes the location of two sequence differences
between strains D273-10B and S. norbensis located in exon
5a. The abbreviations used are: B, BamHI; Bc, BclI; H,
HaeIII; E, EcoRI (Wenzlau et al., 1989).

FIGURE 3: Purification of the restriction
endonuclease by phosphocellulose chromatography

Fraction Ia (see Table I) was dialyzed (Mr cutoff,
10,000) against ice cold buffer containing 50 mM
potassium phosphate, pH 7.5, 10% glycerol, 2 mM EDTA, 2
mM DTT (Buffer A) until an ionic equivalent of
approximately 100 mM KCl was achieved. The sample was
loaded onto a phosphocellulose column (5 mg protein per
packed ml of phosphocellulose), previously equilibrated
at a flow rate of 20 ml/hour with Buffer A. The column
was washed with 120 ml of Buffer A containing 200 mM KCl
at a rate of 60 ml/hour and then a 240 ml linear gradient
from 200 mM KCl to 1 M KCl in Buffer A was applied
at a rate of 80 ml/hour. The aI4α endonuclease
activity eluted at 0.7 M KCl. Active fractions were
pooled (fraction II, Table I) and dialyzed against Buffer
A until an ionic equivalent of 50 mM KCl was reached.

FIGURE 4: Purification of the restriction
endonuclease by Affigel Heparin chromatography

The dialyzed fraction II (Table I) having
endonuclease activity and resulting from phosphocellulose
chromatography (Figure 3) was applied to an Affigel
Heparin column, previously equilibrated with Buffer A
(0.^1 mg protein per packed ml of Affigel Heparin), at a
rate of 2.5 ml/hour. The column was washed with 15 ml of
Buffer A containing 50 mM KCl at 7 ml/hour and eluted
with a 60 ml linear gradient of Buffer A containing 100
mM to 600 mM KCl at a rate of 15 ml/hour. The enzyme
eluted at 300 mM KCl, and the active fractions were
pooled (fraction III, Table I).




FIGURE 5: Purification of the restriction
endonuclease by gel filtration

Fraction III (Table I, Figure 4) was applied to a
Sephacryl HS 200 column (1.4 cm x 85 cm) equilibrated
with Buffer A containing 100 mM KCl at a rate of 3.2
ml/hour. The column was calibrated with blue dextran
2000, β-amylase, yeast alcohol dehydrogenase, lactate
dehydrogenase, bovine serum albumin, carbonic anhydrase
and cytochrome c. The column was eluted with Buffer
A and 0.95 ml fractions were collected. Fractions
containing aI4α endonuclease activity were pooled
(fraction IV).

FIGURE 6: Purification of the restriction
endonuclease by DNA affinity chromatography

Fraction IV (Table I, Figure 5) was applied to a DNA
affinity column equilibrated with Buffer A containing 100
mM KCl at 1.5 ml/hour. The column was washed with 6 ml
of the same buffer at 6 ml/hour and eluted with 18 ml of
a linear gradient of Buffer A containing 100 mM KCl to
800 mM KCl at a rate of 9 ml/hour. The aI4α endonuclease
activity eluted at 300 mM KCl. The active fractions were
pooled and mixed with an equal volume of 50 mM potassium
phosphate, pH 7.5, 80% glycerol and 2 mM EDTA and stored
at -20°C (fraction V, Table I).

FIGURE 7: In vivo double-stranded cuts in recipient
DNA molecules

A restriction fragment spanning the intron insertion
site was examined for double-stranded breaks in mated
cells. The strains 5DSS/D273-10B and COP-19/norbensis
were mated, and mtDNA was isolated at the beginning of
mating and 3, 6, 9, and 24 hr after mating began. The
DNA was digested with either HpaII (A) or HaeIII (B)
and analyzed to detect double-strand cuts near the sites
of insertion for omega and aI4α, respectively. MtDNA was
fractionated on 6% polyacrylamide gels,
electrophoretically transferred to Hybond-N membranes,
and hybridized to end-labeled oligonucleotides specific
to 21S rRNA gene exon sequences (A) or to a sequences
(B). A 0.6 kb fragment (arrow) corresponding to an in
vivo double-stranded cut near the site of omega insertion
was seen from 6-9 hr after initiation of mating in the
HpaII digest (A). Similarly, a 0.9 kb fragment (arrow)
consistent with a double-stranded cut near the site of
aI4α insertion was present from 6-9 hr after mating in
HaeIII digests (B).

FIGURE 8: Mitochondrial extracts contain an
endonuclease that cleaves recipient DNA near the site of
intron insertion

Mitochondrial extracts were assayed for their
ability to cleave recipient (R) or donor (D) DNA.
Plasmids pDR1 (recipient) and pJW1 (donor) were
linearized with EcoRI and 3' end-labeled. These
substrates were incubated with mitochondrial extracts
from wild-type strain ID41-6/161 and mutant derivatives
PZ27, G, and K. Cleavage of pDR1 DNA at or near the site
of intron insertion would yield products of 2.1 kb and
3.0 kb (lanes 3, 5, and 9). Cleavage of pJW1 DNA at the
equivalent exon site would yield products of 1.2 kb and
2.8 kb. The substrate and extract used are indicated
above each lane: 39 ng of extract protein from strain
PZ27 was added to lanes 3, 4, and 5; 39 ng and 3.65 μg of
extract protein from wild-type strain ID41-6/161 were
added to lanes 6 and 7, respectively; and 150 ng of
extract protein from mutant K and 182 ng of extract
protein from mutant G were added to lanes 8 and 9,
respectively.

FIGURE 9: Preliminary cleavage site determination of
the endonuclease

Single-stranded DNA from phages RS18 and RS19 was
labeled using the Sequenase (United States Biochemical
Corporation) protocol and the universal sequencing
primer. This material was then extended either using
dideoxynucleotides (to generate a sequencing ladder) or
with unlabeled nucleotides (to generate the substrate).
Substrate samples were cleaved with a mitochondrial
extract from strain PZ27 (see Figure 7). Figures 9A and
9B show the sequencing gels generated by this protocol
using phages RS19 and RS18 as templates, respectively.
In each panel, lanes 1-4 represent chain termination
reactions with dideoxy G, A, T, and C, respectively; lane
5 is the product of a cleavage reaction on the
"end-labeled" substrate; lane 6 shows this region of the
gel using uncut substrate to demonstrate no premature
primer extension products in the vicinity of the cleavage
site. Figure 9C shows the sequence surrounding the
cleavage site. Intron sequences are shown in lower case
letters while exon sequences are in upper case letters.
The large arrow indicates the point of intron insertion.
The sites of cleavage on each strand are shown by small
arrows. The exon sequences of donor and recipient are
nearly identical; two differences in the exon between
donor and recipient are shown by asterisks. The bracket
sets an upper limit on the boundaries of the recognition
site (based on the data of Figure 10).

FIGURE 10: Location of the recognition sequence of
the aI4α-encoded endonuclease

The recognition sequence was delimited by
determining the extent of each sequencing ladder of
Figure 9 that was resistant to cleavage by the
endonuclease activity. The products of sequencing
reactions using phage RS19 as a template are shown in
Figure 10A, lanes 1-4 (terminated with dideoxy G, A, T,
and C, respectively). These sequencing ladders were used
as a substrate for cleavage reactions using 500 ng of
protein extract from strain PZ27 in lanes 5-8. The
ladder was resistant to cleavage until a recognition
sequence was generated. All chains with further
additions were efficiently cleaved. Figure 10B shows the
results of a similar experiment using RS18 as a template.
Lanes 1-4 show the products of the sequencing ladder
(terminated with dideoxy G, A, T, and C, respectively).
In lanes 5-8, the products of the ladder were cleaved
with extract from strain PZ27. The maximum boundaries of
the recognition sequence are shown in Figure 9C.

This invention is directed to a new and unique 
endonuclease, and to methods of preparing and using this 
enzyme. 

Although other endonucleases have been discovered
which cleave DNA at specific sites, there is clearly a
need for more endonucleases, in particular those with
special and useful properties, to facilitate gene
mapping. The endonuclease described herein has the
capability of producing relatively larger fragments of
DNA and also of acting on restriction sites (cleavage
sites) that are commonly encountered in the genome.
Some enzymes operate only on unusual sequences, e.g.,
only GC base pairs.

The general objective of this invention is to purify
and apply a new endonuclease protein with useful
characteristics not previously available to the art.
Prior to the invention described herein it had not been
possible to isolate and purify the endonuclease protein
from its natural source so that its sequence and function
could be analyzed.



The endonuclease which is disclosed by the current
invention will cleave ordinary DNA sequences. Use of
this endonuclease for gene mapping will facilitate
the determination of DNA sequences
within fragments cleaved by the endonuclease.

In one of the preferred embodiments, the
endonuclease as extracted from mitochondria of strain
WA12/PZ27 yields a biological activity of up to about 100
units/mg of protein. These mitochondrial extracts,
further processed by phosphocellulose chromatography,
resulted in the purity increasing to 34,000 units/mg.
Application of Affigel Heparin chromatography increased
the purity of the endonuclease up to about 50,000
units/mg. Fractionation by gel filtration yielded a
biological activity of 200,000 units/mg. Further
purification by DNA affinity chromatography yields a
biological activity of 500,000 units/mg (Figures 3-6).

20 kilodaltons as determined by SDS-PAGE.

A maximum of 14 base-pair positions are believed to be
required for cleavage. There are at least five residues
known to be essential for the cleavage; in other
positions, substitutions are permitted and cleavage
will still occur (Figure 1).

Most of the endonucleases discovered to date have
been isolated from bacteria
because it is more difficult to purify endonucleases
from more complex systems. This limitation was also due
to the fact that bacteria are capable of producing
enough endonuclease
to provide material on which purification procedures may
be employed.

The mitochondrial genomes of many yeast species have
been mapped and shown to contain both introns and exons
(de Zamaroczy and Bernardi, 1986). Two of the most
informative loci for genetic investigations are the
mosaic genes cob (which codes for cytochrome b) and coxI
(which codes for subunit I of cytochrome c oxidase)
(Figure 2). Many of the introns in these two genes have
open reading frames, sequences which are read as
extensions of the exon immediately preceding them. An
unusual circumstance is that most introns in
yeast mitochondrial genomes may be expressed as proteins.

In common with yeast, the mammalian systems have
genes for cytochrome b, cytochrome oxidase and subunits
of ATPase. The known mitochondrial gene products vary
little among eukaryotes as diverse as fungi and
vertebrates. The yeast mitochondrial genome consists of
large numbers of conserved DNA sequences plus
interruptions by an assortment of so-called optional
sequences. The genome of a given yeast strain appears to
be very stable. Coding sequences among strains are
conserved, but optional sequences are present or absent
at various positions in many combinations. The wide
range of functional combinations of introns arose during
speciation of yeast. Further combinations of those
probably arose in laboratory strains by unselected
mitochondrial recombination during the many decades of
genetic manipulation in which current laboratory strains
were derived (Butow, 1985). The operational definition
of optional sequences is that the presence or absence of
these sequences does not affect phenotypic expression of
the exon coding region.

Much has been learned about the mechanism of RNA
splicing by examination of yeast systems. RNA splicing
normally removes the introns from the transcripts of
genes by breaking the phosphodiester bonds at the
exon-intron boundaries, and subsequently forming bonds
between the ends of the free exons. There are various
mutants of yeast which are unable to splice the introns;
therefore the precursors accumulate in the cells. This
can occur in either the nuclear or the mitochondrial
genes, or both. The mitochondrial introns in yeast may be
divided into two types, group 1 and group 2. Group 1
introns appear to have a common secondary structure and
short conserved sequences internally, but no specific
conservation of sequences at the splice junctions. Some
of these introns in the coxI gene and the single intron
of the large mitochondrial ribosomal RNA gene of
Saccharomyces cerevisiae are self-splicing. Mutations
in other genes, however, can affect the occurrence of
splicing. In both group 1 and group 2 there are some
introns which must translate an extensive coding sequence
to splice the intron containing it. Some introns encode
proteins involved in splicing, called RNA maturases.
Maturases appear to result from translation of both exons
and introns. Both homologous and non-reciprocal
mitochondrial recombinations occur in genetic crosses of
yeast. Non-reciprocal exchanges involving some optional
sequences have been reviewed by Perlman et al. (1989).
The best characterized system of non-reciprocal exchange
analyzed in yeast is the omega system. Omega is a 1.1 kb
optional intron in the 21S rRNA gene. In genetic crosses
between omega-positive and omega-negative yeast, the
progeny are usually omega-positive. That is, there seems
to be unidirectional transfer of the intron. Mutants
lacking this polarity of transfer occur either as base
changes close to the site where the intron would be
inserted, or as mutations in the intron reading frame
preventing its translation. The protein product of the
omega intron is not involved in omega splicing.



The hypothesis generated from studying the omega
system is that the protein coded by the intron in the
omega-positive strain (fit1 or omega transposase/I-SceI)
recognizes the DNA site where the intron should be
inserted in an omega-negative strain. This causes the
intron to be preferentially inherited. The protein coded
by the omega intron has endonuclease activity recognizing a
specific sequence in the omega-negative gene as a target
for cleavage of the double-stranded DNA (Jacquier and
Dujon, 1985; Colleaux et al., 1986). The recognition
site lies at the site where the intron is inserted. The
double-strand break probably initiates what is called a
gene conversion, a process in which the sequence of the
omega-positive gene is copied and replaces the sequence
of the omega-negative gene.

Cleavage of intronless (omega-negative) DNA near the
site of intron insertion is required for conversion, and
it is believed that introns carrying sequences which
accomplish gene conversion may have once been independent
elements that coded their own splicing or interacted with
the DNA of the recipient strain. This creates a
transposable element (transposon). The finding that such
gene conversions require cleavage of the
recipient DNA by a site-specific endonuclease leads to
the expectation that isolation and purification of such
endonucleases would prove extremely valuable, not only
in understanding the genetic mechanisms of these
conversions but also in providing a valuable reagent for
gene mapping studies.

When two parental strains having distinguishable
mitochondrial DNAs are mated, the progeny are expected to
represent the parental DNA in some aspects.
Recombination refers to the new union of DNA present in
separate parents, in a single progeny DNA. Mitochondrial
DNA from different parental strains is capable of
recombining.
Through studies of gene conversion in yeast
mitochondria, it has been determined that one of the
mechanisms achieving gene conversion employs an
endonuclease (encoded by intron aI4α) which cleaves the
recipient DNA at specific sites (Wenzlau et al., 1989).
For this endonuclease, the aI4α-encoded protein, there
exist yeast strains that overproduce or overexpress
the enzyme relative to wild-type yeast strains because of
various mutations of the yeast mitochondrial genome. An
example of such a strain of yeast is the WA12 strain
containing the PZ27 mutant mitochondrial DNA of
Saccharomyces cerevisiae. Strain PZ27 was isolated in
1978. In the mitochondrial genome of this strain the
cytochrome oxidase gene (coxI) is present, but the
cytochrome b gene is deleted. This configuration blocks
splicing of aI4α, causing the protein encoded by that
intron reading frame to be present at high levels in
mitochondria. This protein is a site-specific
endonuclease that is translated from a fusion between the
upstream exons of coxI and the open reading frame within
the intron aI4α. All yeast have the coxI gene, and most
contain intron aI4α. A few strains, however, have lost
the capability of splicing the intron, leading to
overproduction of the endonuclease.


In wild-type yeast strains splicing of this intron
4α is assisted by a product of the fourth intron of the
cytochrome b gene called the bI4 maturase. Splicing
would normally remove the intron. However, there are
several strains in which there is an inability to splice
aI4α, resulting in an accumulation of the endonuclease
produced by that region of DNA. This overproduction
occurs, for example, if there are mutations at the bI4
locus or deletions of the bI4 reading frame. In strain WA12
containing mtDNA from strain PZ27 there is a deletion of
the cob gene, and consequently a deletion of intron bI4,
leading to continued production of the endonuclease
produced by the coxI gene because there is no splicing of
the intron. Otherwise the strain has a wild-type coxI
gene based on complementation studies.

Some other strains exist with defects in the bI4
maturase that would lead to this overproduction. In fact
any strain lacking bI4 maturase and thereby unable to
splice out aI4α will overproduce the
endonuclease. A preferred embodiment is to use the WA12
strain as a source of crude extracts of mitochondria.
After culturing the yeast, mitochondrial extracts can be
prepared by standard methods of lysing cells and
fractionation. These extracts can be further purified by
phosphocellulose chromatography,
Affigel heparin chromatography, gel filtration and DNA
affinity chromatography. Any other suitable methods of
purification of proteins known to those skilled in the
art may be applied.

The endonuclease described herein was capable of
cleaving double-stranded DNA at a specific site to
achieve gene conversion in recipient strains (Figures
7-8). It also has the potential to become a maturase
under certain conditions. A maturase is an
intron-encoded RNA splicing protein. It has been
reported that the maturase coded in yeast by the fourth
intron of the cytochrome b gene (bI4) is essential for
splicing the fourth intron of the gene encoding subunit I
of cytochrome c oxidase (coxI gene). The coxI intron
4α encodes a protein that is closely related structurally
to the bI4 maturase, but in wild-type yeast strains is
not utilized in splicing either the fourth intron of coxI
or cob. However, the aI4α product can be activated to a
new maturase form by either a point mutation (mim-2)
(Dujardin et al., 1982) or the presence of particular
mutant forms of the nuclear NAM2 gene (Herbert et al.,
1988).
Although intron-encoded endonucleases have not
previously been isolated and purified from yeast
mitochondria, the use of special laboratory strains in
this invention which continue to produce these products
due to lack of splicing permitted the isolation of
endonucleases in amounts sufficient to be
purified and characterized. The function and structure
of one of these endonucleases was determined for the
first time as an aspect of this invention because of the
increased amounts obtained from the special strain
WA12/PZ27 of Saccharomyces cerevisiae. The specific site
of cleavage of double-stranded DNA was determined by
presenting known sequences of DNA of different lengths to
the endonuclease and studying the resulting cleavage
products. A partial recognition site determination is
shown in Figure 10.

The purified fraction is then selected which
comprises an endonuclease capable of cleaving DNA at
specific sites. This capability can be tested by various
assays. The cleavage site, plus flanking DNA sequences,
can be introduced into various plasmids to use as an in
vitro assay. Essentially any plasmid DNA containing the
cleavage site will be suitable. In a preferred
embodiment the assay used comprised a plasmid pRSX which
contains a 292 base pair RsaI-HindIII fragment that
includes the cleavage site from Saccharomyces capensis
cloned into the vector pBS(+) from Stratagene.






Two assays were used to determine the activity of
the endonuclease:

1. A supercoiled DNA preparation of pRSX was incubated
with the endonuclease, and cleavage activity was
determined by gel electrophoresis of the resulting
fragments. If cleavage has occurred there will be
conversion of the supercoiled DNA to the linear form
of the plasmid, which can be detected by gel
electrophoresis;

2. pRSX DNA was linearized with ScaI. Using standard
techniques the linear plasmid was end-labelled and
extracted, and endonuclease activity was determined
by the production of two radioactive fragments after
incubation with the enzyme for which activity is to
be determined.

An object of this invention is to cleave DNA at
specific sites. This was achieved by
preparing the endonuclease protein as defined above and
incubating the enzyme with DNA so as to
effectuate the cleaving of the DNA.

Strains shown in Figure 2 were used in crosses to
study the transmission and recombination behavior of
three group 1 introns of the coxI gene. The data shown
in Figure 7 indicated that the mated cells contain an
endonuclease activity capable of cleaving recipient
genomes in the vicinity of the site of conversion.
Experiments employing petite mutants of yeast, which do
not express mitochondrial genes, showed that no
conversion occurred in petites. The conclusion from these
studies was that a product of mitochondrial protein
synthesis is needed for aI4α conversion. Results of
crossing strains of yeast differing in their intron
configuration of three mosaic genes (coxI, cob, 21S
rRNA), and use of various strains having point mutations
of aI4α, led to the conclusion that the aI4α intron
likely encodes the information needed for aI4α
conversion. The aI4α reading frame must be intact to
effect conversion.



In this invention, purification and subsequent
application of an enzyme not previously purified was
achieved, and the enzyme was found to cleave DNA as shown
in Figure 1.
The endonuclease requires Mg2+ at a concentration of about
25 mM for activity and shows high activity at pH 7.5 in
the presence of 200 mM salts of monovalent cations.
Activity does not appear to be stimulated by the addition
of nucleoside triphosphates (see also Figure 8). Other
properties of the endonuclease illustrated in Figure 8
are: lane 3: the mitochondrial extract from strain PZ27
cleaves the recipient plasmid once near the intron
insertion site (this linear product can be circularized
by DNA ligase); lane 4: donor DNA is not cleaved; lane 5:
only recipient DNA is cleaved; lanes 6 & 7 show that
cleavage activity is not detected in extracts of
mitochondria from a wild-type strain capable of efficient
splicing of aI4α; lane 8: no cleavage using extract from
an aI4α mutant with a truncated intron ORF; lane 9:
cleavage from an extract of an aI4α mutant with an intact
intron ORF.

The endonuclease has been shown to be active in E.
coli by use of an artificially engineered sequence of the
endonuclease coding sequences in a transforming vector,
using the universal genetic code. However, it is highly
toxic in E. coli, probably due to its cleavage of
double-stranded E. coli DNA (Delahodde et al., 1989).

A major advantage of the purified endonuclease is
that it cuts DNA relatively infrequently, and
consequently is particularly useful for genome mapping.
If all 14 nucleotides are required for cleavage, then one
can estimate 1 to 2 cleavage sites per human chromosome.
However, because there is some relaxation in the
specificity of the cleavage site, empirical tests leading
to actual determination of cuts effected by the
endonuclease in cloned portions of human chromosome
number 3 show one site about every 200 kilobase pairs.
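
The relationship between site length and cutting frequency can be illustrated with a back-of-envelope calculation. The sketch below assumes random DNA with equal base frequencies, which real genomic DNA does not satisfy, and its function names are illustrative only. Under that assumption a fully specified 14 bp site recurs about once per 268 million base pairs, the same order of magnitude as a large human chromosome, while the observed spacing of about 200 kilobase pairs corresponds to roughly 9 fully specified positions.

```python
import math


def expected_spacing_bp(site_length, orientations=1):
    """Average distance (bp) between occurrences of one fixed site in random,
    equal-base-composition DNA. Use orientations=2 to count a
    non-palindromic site on either strand."""
    return 4 ** site_length / orientations


def effective_site_length(spacing_bp):
    """Number of fully specified base pairs that would give this spacing."""
    return math.log(spacing_bp, 4)


print(expected_spacing_bp(14))                    # one strand orientation
print(expected_spacing_bp(14, orientations=2))    # either orientation
print(round(effective_site_length(200_000), 1))   # observed ~200 kb spacing
```

The gap between 14 required positions and an effective length near 9 is one way to express the relaxation in specificity noted above.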
Since the DNA sequence which the inventors have
discovered to encode the endonuclease is available
(Bonitz et al., 1980), it could be engineered for
expression in, e.g., E. coli (Delahodde et al., 1989).
However, the preferred embodiment is to prepare this
endonuclease from yeast mitochondrial extracts because of
possible variations in the protein during the course of
molecular engineering and expression in other species and
because of its apparent toxicity to bacterial cells.


A major current scientific project is to map the
entire human genome. This is an expensive,
labor-intensive, and difficult task which will be
facilitated by use of the endonuclease described in this
invention. Human gene mapping permits detection of
carriers of abnormal genes, and isolation and cloning of
genes which cause abnormal human conditions. A goal of
the mapping project is further elucidation of gene
action, paving the way for therapy. Complete mapping of
the human genome will also contribute to the field of
preventive medicine by making it possible to determine
the susceptibility of individuals to specific genetic
disorders. Because the phenotype (expression of such
genes) represents the interaction between genes and
environment, preventive action based on knowledge of the
genetic complement of the individual may be achieved by
altering their environment (diet, drug, medication) to
reduce or eliminate the severity of the abnormal genetic
defects. The ultimate goal, of course, is to identify
these genes so that direct gene therapy will be possible.
If the genomic sequences are known, then normal genes may
be directed into the genotype, perhaps at early stages of
development, to counteract the effects of abnormal genes.
A further goal of gene mapping is to understand the
interaction and interrelationships of genes on different
parts of the chromosome and on different chromosomes.
The purification, isolation and use of the
endonuclease which is an object of this invention are
further described in the following examples:

Example 1: Yeast strain. Purification of the
endonuclease is made possible by the use of a strain or
strains that overproduce the enzyme. In wild-type yeast
strains, splicing of aI4α is assisted by the "maturase"
product of the fourth intron (bI4) of the cytochrome b
gene (cob); mutants of the bI4 maturase, or deletions of
all or portions of the bI4 reading frame, result in a
block in splicing of aI4α. Because the mRNA for the aI4α
endonuclease is a fusion of the intron reading frame with
those of the upstream exons, the inability to splice aI4α
results in an accumulation of the endonuclease mRNA and,
hence, an overproduction of the protein. Therefore, to
facilitate purification of the endonuclease protein, a
strain (WA12) containing the mutant mitochondrial
genome PZ27, which is deleted for the cob gene, is used.
(Note, however, that any strain with a defect in the
synthesis of the bI4-encoded maturase would overproduce
the aI4α endonuclease and therefore be suitable as a
starting strain for purification.)

Example 2: DNA cleavage site. The cleavage site
for the endonuclease is the sequence shown in Figure 1.
This site is cleaved with high efficiency by crude and
purified preparations of the endonuclease, which catalyzes
a 4 bp staggered double-strand break as shown in Figure
1, leaving 3' OH overhangs. Preliminary determination of
the cleavage site is shown in Figure 9. The boundaries
of the recognition site are shown in Figure 10. Methods
are detailed in Wenzlau et al., 1989.
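
The geometry of a 4 bp staggered break that leaves 3' overhangs can be sketched in a few lines. This is an illustration only: the sequence used is a hypothetical placeholder and is not the recognition site of Figure 1, and the function name and cut position are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def staggered_cut(top, top_cut, overhang=4):
    """Cut a duplex so that both fragments keep a protruding 3' end.

    `top` is the top strand written 5'->3'. The top strand is cut after
    `top_cut` bases and the bottom strand `overhang` bases further left.
    Returns ((left_top, left_bottom), (right_top, right_bottom)); bottom
    pieces are written 3'->5' so they align under the top pieces.
    """
    bottom_cut = top_cut - overhang
    if not 0 <= bottom_cut < top_cut <= len(top):
        raise ValueError("cut positions fall outside the sequence")
    bottom = top.translate(COMPLEMENT)  # 3'->5', aligned base for base with top
    left = (top[:top_cut], bottom[:bottom_cut])
    right = (top[top_cut:], bottom[bottom_cut:])
    return left, right


# Hypothetical 14 bp duplex, NOT the site shown in Figure 1.
left, right = staggered_cut("ACGTTGCAAGCTTA", top_cut=9)
print(left)   # top strand is 4 nt longer: its 3' end protrudes
print(right)  # bottom strand is 4 nt longer: its 3' end protrudes
```

In each fragment the longer strand ends in a 3' hydroxyl, which is the configuration described for the endonuclease above.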

Example 3: Assays for endonuclease activity. For
an in vitro assay of endonuclease activity, the cleavage
site shown in Figure 1, plus flanking DNA sequences, was
introduced into various plasmids. Essentially any
plasmid DNA containing the cut site is suitable. For
the assays described herein, the plasmid pRSX containing
a 292 bp RsaI-HindIII fragment that includes the cut
site from S. capensis cloned into the vector pBS(+) from
Stratagene was used.

Two assays were conveniently used to follow enzyme
activity:

a) Cleavage of supercoiled DNA. A supercoiled
preparation of pRSX was incubated with the endonuclease
under the assay conditions described below, and cleavage
activity was evident, following gel electrophoresis of
the sample, as a conversion of the supercoiled to the
linear form of the plasmid;

b) Linearization and end-labelling of pRSX. pRSX
DNA was linearized with ScaI. After phenol extraction
and ethanol precipitation, the linear plasmid was
end-labelled with [32P]dATP and T4 DNA polymerase by
standard procedures. The labelled DNA was then extracted
by phenol and the unincorporated dNTPs were removed by
gel filtration. The radioactive pRSX DNA was ethanol
precipitated and resuspended in sterile water.
Endonuclease activity was followed by the production of
two radioactive fragments following incubation with
active enzyme.

The endonuclease assay conditions used were as
follows: Reaction mixtures (0.025 ml) contained 25 mM
Tris-HCl, pH 7.5, 20 mM MgCl2, 100 mM KCl, 2 mM DTT and
100 ng pRSX plasmid DNA. Incubation was carried out at
30°C for 30 minutes. Reactions were stopped by being
placed on ice and a 1/10 volume of gel loading buffer was
added. The samples were analyzed by electrophoresis in
0.7% agarose gels containing 1 µg/ml ethidium bromide.
For purposes of this invention, one unit of endonuclease
activity was defined as that amount that catalyzes the
cleavage of 50 ng of the DNA substrate in one hour at
30°C. Other definitions used in the art would be
consistent with this invention.
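
The unit definition above can be turned into a small calculation. The sketch below assumes that cleavage is linear with time over the incubation, which the text does not state, and the function names and the example reading (half of a 100 ng reaction cleaved in 30 minutes) are hypothetical.

```python
NG_PER_UNIT_HOUR = 50.0  # from the definition: 1 unit cleaves 50 ng in 1 hour at 30 C


def units_of_activity(ng_cleaved, hours):
    """Units present in an assay, assuming cleavage is linear with time."""
    if hours <= 0:
        raise ValueError("incubation time must be positive")
    return ng_cleaved / (NG_PER_UNIT_HOUR * hours)


def specific_activity(units, mg_protein):
    """Units per mg of protein in the assayed sample."""
    return units / mg_protein


# Hypothetical reading: half of the 100 ng substrate is linearized in 30 min.
units = units_of_activity(ng_cleaved=50.0, hours=0.5)
print(units)  # 2.0
```

Dividing the units found by the protein mass added to the reaction gives the specific activity in units/mg used throughout the purification summary.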






Example 4: Protocol for the Purification of
Endonuclease. The following table, Table I, is an
overview of the protocol which was employed to purify the
enzyme, and also indicates the degree of purification
that was achieved at each step.
Table I: SUMMARY OF PURIFICATION
OF THE RESTRICTION ENDONUCLEASE

Sequential                Activity*     Yield    Protein    Specific
Purification              (U x 10^-3)   (%)      (mg)       Activity
Steps                                                       (U/mg)

I.   Crude Extract            26         100     216            100
Ia.  40-60% (NH4)2SO4          3                 114
II.  Phosphocellulose         20          79       0.6       34,000
III. Affigel Heparin           5          19       0.1       50,000
IV.  Gel Filtration            4          16       0.02     200,000
V.   DNA Affinity              1           4       0.002    500,000

*For purposes of this invention, one unit of
endonuclease activity was defined as that amount
that catalyzed the cleavage of 50 nanograms of
DNA substrate in one hour at 30°C. Other
definitions of activity are within the scope of
this invention.
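
The columns of Table I are related by simple ratios, which can be recomputed as a consistency check. The sketch below is illustrative only; the data layout and function name are assumptions, and row Ia is omitted because its yield and specific activity are not given in the table. The recomputed values agree with the table to within rounding (for example about 120 rather than 100 units/mg for the crude extract, and about 77% rather than 79% yield after phosphocellulose).

```python
# (step, total activity in units, protein in mg, reported specific activity in U/mg)
TABLE_I = [
    ("I. Crude extract", 26_000, 216, 100),
    ("II. Phosphocellulose", 20_000, 0.6, 34_000),
    ("III. Affigel heparin", 5_000, 0.1, 50_000),
    ("IV. Gel filtration", 4_000, 0.02, 200_000),
    ("V. DNA affinity", 1_000, 0.002, 500_000),
]


def summarize(rows):
    """Recompute specific activity (U/mg) and yield (% of crude activity)."""
    crude_units = rows[0][1]
    summary = []
    for step, units, mg, reported in rows:
        summary.append((step, units / mg, 100 * units / crude_units, reported))
    return summary


for step, computed, yield_pct, reported in summarize(TABLE_I):
    print(f"{step:22s} {computed:10.0f} U/mg (reported {reported:>7,}), yield {yield_pct:5.1f}%")
```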






(a) Preparation of Crude Mitochondrial Extract and
Ammonium Sulfate Fraction



A Saccharomyces cerevisiae strain (WA12/PZ27) was
grown in medium composed of 1% yeast extract, 1%
bacto-peptone and 2% galactose to late log phase at 30°C.
The cells were collected by centrifugation, washed with 2
mM EDTA, at pH 8, and resuspended at 2 ml/g, wet weight,
in buffer containing 33 mM potassium phosphate, pH 5.8, 1
M sorbitol and 2 mM EDTA. Mureinase was added at a ratio
of 4 mg/g wet weight, and the mixture was incubated on a
platform shaker at room temperature for one hour.
Spheroplasts were collected by centrifugation at 1000 x g
for 10 minutes at 3°C and resuspended in ice cold
breakage buffer which contained 50 mM Tris-HCl, pH 7.5,
0.6 M sorbitol, 2 mM EDTA, 2 mM DTT and protease
inhibitors (which were also included in all subsequent
steps of the purification): PMSF (1 mM), leupeptin (1
µg/ml) and aprotinin (1 µg/ml). Spheroplasts were lysed
by shaking with glass beads. The lysate was recovered
and the glass beads were washed 3 times with breakage
buffer. The combined spheroplast lysate was centrifuged
at 1000 x g for 10 minutes at 3°C.

A high salt extract of the crude supernatant was
prepared by addition of 0.33 volume of 4 M KCl to achieve
a final concentration of 1 M KCl. MgCl2 and DTT were
added to final concentrations of 10 mM and 2 mM,
respectively. The high salt extract was stirred in the
cold for 30 minutes and centrifuged at 15,000 x g for 15
minutes at 3°C. The supernatant (fraction I, Table I)
was recovered and diluted 2-fold by addition of ice cold
buffer containing 50 mM Tris-HCl, pH 7.5, 2 mM EDTA and 2
mM DTT. One-tenth volume of a 25% streptomycin sulfate
solution was added and the extract was stirred in the
cold for 30 minutes followed by centrifugation at 15,000
x g for 15 minutes at 3°C.
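
The figure of 0.33 volume of 4 M KCl follows from a simple dilution relation. The sketch below assumes that volumes are additive and that the supernatant contains negligible KCl before the addition; neither assumption is stated in the text, and the function names are illustrative.

```python
def final_concentration(stock_molar, stock_volumes, sample_volumes=1.0):
    """Concentration after adding stock to a sample that contains none of the solute."""
    return stock_molar * stock_volumes / (stock_volumes + sample_volumes)


def stock_volumes_needed(stock_molar, target_molar):
    """Volumes of stock to add per volume of sample to reach the target."""
    if target_molar >= stock_molar:
        raise ValueError("target must be below the stock concentration")
    return target_molar / (stock_molar - target_molar)


print(round(final_concentration(4.0, 0.33), 3))   # close to 1 M
print(round(stock_volumes_needed(4.0, 1.0), 3))   # one third of a volume
```

The same relation applies to the later 2-fold dilution, which halves the KCl concentration before the streptomycin sulfate step.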

Solid ammonium sulfate was added to the crude
supernatant over a period of 30 minutes to achieve 40% of
saturation at 0°C. Stirring
was continued for another 30 minutes and the precipitate
was collected by centrifugation and discarded. The
supernatant was recovered and additional solid ammonium
sulfate was added over a period of 30 minutes to achieve
60% of saturation at 0°C. Stirring was continued for
another 30 minutes and the
precipitate, which contained the aI4α endonuclease
activity, was collected by centrifugation at 15,000 x g
for 15 minutes at 3°C and resuspended in buffer
containing 10 mM potassium phosphate, pH 7.5, 45%
glycerol and 2 mM EDTA and stored at -20°C (fraction Ia
in Table I).



The next step in the purification was to subject
fraction Ia (Table I) to phosphocellulose chromatography.
The details of the procedure are described in Figure 3.
Fractions showing endonuclease activity were pooled
(fraction II in Table I) and dialyzed against Buffer A
until an ionic equivalent of 50 mM KCl was reached. This
dialyzed fraction was then applied to an Affigel Heparin
column (Figure 4). The active fractions from this
procedure were pooled to form Fraction III (Table I).
This pooled fraction was then applied to a Sephacryl
HS200 column (Figure 5). The active fractions obtained
from gel filtration were pooled to form Fraction IV
(Table I). This fraction was applied to a DNA affinity
column (Figure 6). The pooled active fractions (V, Table
I) were mixed with an equal volume of 50 mM potassium
phosphate, pH 7.5, 80% glycerol and 3 mM EDTA and stored at
-20°C.

Example 5: Analysis of mtDNA. Mitochondrial DNA was
prepared by the procedure of Bingham and Nagley (1983).
Restriction enzyme digestions were performed as
recommended by the enzyme supplier (Bethesda Research
Laboratories). Fragments were separated on 1% agarose or
6% polyacrylamide gels and transferred by the well-known
method of Southern (1975) or electrophoretically to
Hybond-N membranes (Amersham). Hybond-N membranes were
prehybridized in 6x SSC containing 10x Denhardt's, 0.5%
SDS, and 0.5 mg/ml of calf thymus DNA at 65°C.
Oligonucleotides were end-labeled with T4 polynucleotide
kinase and allowed to hybridize to membranes for 2 hr at
42°C. Blots were washed three times for 15 min each at
50°C in 6x SSC. Other probes, generated by the
multiprime DNA labeling system (Amersham), were
hybridized to membranes overnight at 42°C. Blots were
washed twice at 42°C for 15 min each in 2x SSC
containing 0.1% SDS, and twice for 15 min each in 0.2x
SSC containing 0.1% SDS. All filters were exposed to
Kodak XAR-5 X-ray film.

Example 6: Extract Preparation and Endonuclease
Assay. Galactose-grown cells were harvested at late
logarithmic phase, and a mitochondrial fraction was
isolated as described by Hudspeth et al. (1980). All
further manipulations were done at 4°C. Mitochondria
were resuspended in a solution containing 0.6 M sorbitol,
25 mM Tris-HCl (pH 7.5), 25 mM EDTA, and 30% Percoll and
were further purified by banding in self-forming Percoll
gradients obtained by centrifuging the resuspended
mitochondria at 15,000 x g for 20 min. The mitochondrial
band was isolated and diluted with 10 vol of a solution
containing 0.5 M NH4Cl and 25 mM Tris-HCl (pH 7.5) and
centrifuged at 15,000 x g to pellet the mitochondria.
Mitochondria were lysed in a solution containing 50 mM
Tris (pH 7.5), 100 mM NH4Cl, 10 mM MgCl2, 2 mM DTT, and 1%
NP40. The extract was clarified at 15,000 x g, and the
supernatant was used for endonuclease assays. Protein
concentration was determined by the BCA protein assay
(Pierce Chemical Company).

For the cleavage experiment described in Figure 8,
the substrates pDR1 and pJW1 were linearized with EcoRI
and the overhangs filled in with [α-32P]dATP and dTTP
using the Klenow fragment. Unincorporated nucleotides
were eliminated on a spin column and the DNA was phenol
extracted, ethanol precipitated, and resuspended in a
solution containing 10 mM Tris (pH 7.5) and 1 mM EDTA.
Reactions were performed with 10 ng of substrate in a 25
µl reaction. Reaction conditions for this study were 25
mM Tris (pH 7.5), 25 mM MgCl2, 200 mM NH4Cl, and 2 mM DTT
at 30°C. Reactions were allowed to proceed for 5 min and
were stopped with 2.5 µl of a solution containing 50 mM
Tris (pH 7.5), 100 mM EDTA, 2% SDS, 15% Ficoll, and 1%
bromophenol blue. The reaction products were separated
on a 1% agarose gel and visualized by autoradiography.
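
Setting up the 25 µl reaction described above is a matter of scaling stock solutions to the stated final concentrations. The sketch below is illustrative: the final concentrations come from the text, but the stock concentrations and the 5 µl allowance for substrate plus extract are hypothetical values chosen for the example, and the function name is an assumption.

```python
# Final concentrations (mM) from the text.
FINAL_MM = {"Tris (pH 7.5)": 25, "MgCl2": 25, "NH4Cl": 200, "DTT": 2}

# Hypothetical stock concentrations (mM); substitute the stocks actually on hand.
STOCK_MM = {"Tris (pH 7.5)": 1000, "MgCl2": 1000, "NH4Cl": 2000, "DTT": 100}


def pipetting_plan(total_ul, final_mm, stock_mm, other_ul=0.0):
    """Volume (µl) of each stock, plus water, for one reaction.

    `other_ul` is the volume reserved for substrate DNA and enzyme extract.
    """
    plan = {name: total_ul * final_mm[name] / stock_mm[name] for name in final_mm}
    water = total_ul - other_ul - sum(plan.values())
    if water < 0:
        raise ValueError("stocks are too dilute for this reaction volume")
    plan["water"] = water
    return plan


# 25 µl reaction with a hypothetical 5 µl reserved for substrate and extract.
for name, volume in pipetting_plan(25.0, FINAL_MM, STOCK_MM, other_ul=5.0).items():
    print(f"{name:14s} {volume:6.3f} µl")
```

In practice the buffer components would usually be combined into a concentrated master mix so that sub-microliter volumes need not be pipetted individually.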


Example 7: Mapping of the Cleavage and Recognition
Sites. Using a modification of the Sequenase® (United
States Biochemical Corporation) sequencing strategy, the
cleavage site was mapped by generating double-stranded
substrates from single-stranded templates (phages RS18
and RS19) bearing the cleavage site. Two nanograms of
the universal sequencing primer was annealed to 1 ng of
each template in a 20 µl Sequenase® labeling reaction.
The reaction was extended with 1 mM dNTPs to generate the
"end-labeled" substrate. One-fifth of that material was
cleaved with PZ27 extract as described above. An aliquot
(1/50) was separated alongside a sequencing ladder
generated from the rest of the labeling mixture and
separated on 5% polyacrylamide gels. To determine the
boundaries of the recognition site, cleavage reactions
were performed on the sequencing ladders themselves
(Figures 9 and 10).



While the invention is susceptible to various
modifications and alternative forms, specific embodiments
thereof have been shown by way of example in the drawings
and have been described in detail. It should be
understood, however, that it is not intended to limit the
invention to the particular forms disclosed, but on the
contrary, the intention is to cover all modifications,
equivalents, and alternatives falling within the spirit
and scope of the invention as defined by the appended
claims.
CLAIMS:

1. An endonuclease protein having an apparent molecular
weight of about 29,000 daltons as determined by SDS-PAGE,
and which is capable of cleaving double-stranded DNA at a
specific site identified as follows:







2. The endonuclease protein of claim 1, further defined
as a mitochondrial-derived enzyme.

3. The endonuclease protein of claim 2, further defined
as a mitochondrial enzyme derived from yeast.

4. The endonuclease protein of claim 3, further defined
as a Saccharomyces cerevisiae derived enzyme.

25. 

5. The endonuclease of claim 1 further defined as having 
a biological activity of up to about 100 units/mg protein 
in crude extracts. 

30 

6. The endonuclease of claim 5 further defined as having 
a biological activity of up to about 34,000 units/mg 
after phosphocellulose chromatography. 

35 
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7. The endonuclease of claim 6 further defined as having 
a biological activity of up to about 50,000 units/mg 
after affigel heparin chromatography. 

5 

8 . The endonuclease of claim 7 further defined as having 
a biological activity of up to about 200,000 units/mg 
after gel f iltration. 

10 

9. The endonuclease of claim 8 further defined as having 
a biological activity of up to about 500,000 units/mg 
after DNA affinity chromatography, 

10. The endonuclease of claim 1 wherein the endonuclease is translated
from a fusion between the upstream exons of the mitochondrial
cytochrome oxidase subunit I gene (coxI) of yeast and the open reading
frame (ORF) within the 4th intron (aI4α) of the coxI gene.

11. The endonuclease of claim 1 wherein the enzyme is capable of
cleaving recipient DNA molecules near the site of yeast mitochondrial
coxI intron (aI4α) insertion.

12. The endonuclease of claim 1 which is also capable of acting as a
maturase under certain conditions, said conditions comprising the
coincidence of a point mutation in the intron reading frame (the mim-2
mutation), or the presence of the nuclear NAM2 gene.

13. A method for preparing an endonuclease having the capability of
cleaving double stranded DNA, comprising the steps:

culturing yeast that are incapable of splicing the aI4α intron of the
coxI gene;

preparing a mitochondrial extract from the yeast;

fractionating the extract; and

selecting a fraction or fractions which comprises an endonuclease as
defined by claim 1.

14. The method in claim 13 wherein the yeast comprises the WA12/PZ27
strain of Saccharomyces cerevisiae.

15. The method in claim 13 wherein the yeast comprises a strain
incapable of expressing the bI4 maturase activity of the cytochrome b
gene (cob) in an appropriate culture medium.



16. A method of cleaving DNA comprising the steps of:

preparing an enzyme as defined by claim 1; and

incubating the enzyme with DNA so as to effectuate the endonucleolytic
cleaving of the DNA.

Example 4

Quantification of Normal KIT and Splice Mutant KIT (intron 17 nt1 G→A)

As the splice site mutation is present in only one of the duplicated
regions of I and not in the duplicated region of I^P, the various
genotypes can be expected to have the attributes described in Table 3.
TABLE 3

Genotype   Copies of     Copies of KIT containing   Ratio of normal KIT
           Normal KIT    the splice mutation        to splice mutant KIT

I/I        2             2                          1:1
I/i        2             1                          2:1
i/i        2             0                          2:0
I/I^P      3             1                          3:1
I^P/i      3             0                          3:0



Due to the dominance of allele I, three of the genotypes in Table 2 are
carried by white animals and therefore cannot be identified by
phenotypic characterisation. Quantification of the relative amounts of
the normal KIT gene and the splice mutant KIT gene allows the ratio
between the two to be calculated, and therefore the genotype of
individual animals predicted. This was achieved by quantification of
two DNA fragments following NlaIII digestion. The amount of the 134 bp
fragment, representative of the normally spliced KIT gene, and of the
54 bp fragment, representative of the splice mutant KIT, were measured
following electrophoresis using GeneScan software.

i. PCR to Produce DNA for Quantification

As described in example 2 section i. The reverse primer KIT35 is
labelled with the ABI fluorescent dye FAM at the 5' end.

ii. Restriction Enzyme Digestion

As described in example 2 section ii.

iii. Electrophoresis and Quantification of DNA Fragments

Following digestion, 0.5 µl of the reaction volume was mixed with 2.5 µl
of deionised formamide, 0.5 µl of GS350 DNA standard (ABI) and 0.4 µl
blue dextran solution before being heated to 90°C for 2 minutes and
rapidly cooled on ice. Three µl of this mix was then loaded onto a 377
ABI Prism sequencer and the DNA fragments separated on a 6%
polyacrylamide gel in 1 X TBE buffer for 2 hours at 700 V, 40 mA, 32 W.
The peak areas of fragments representative of both the normal and
splice mutant forms of KIT were quantitated using the GeneScan (ABI)
software.

iv. Ratio Calculations

The peak area value of the 134 bp fragment (normal KIT) was divided by
twice the peak area value of the 54 bp fragment (splice mutant KIT) in
order to calculate the ratio value for each sample.

v. Results

Analysis was performed on animals from the Swedish wild pig/Large White
intercross pedigree for which genotypes at I have been determined by
conventional breeding experiments with linked markers. Figure 5 and
Table 4 show the ratio of normal to mutant KIT calculated for animals
from each of the three genotype classes, I/I (expected ratio 1:1), I/i
(expected ratio 2:1) and I/I^P (expected ratio 3:1). The results are
entirely consistent with the expected ratio values and indicate that
the three genotype classes can be distinguished using this method.

TABLE 4

Ratio of the Two KIT Forms in Different Dominant White Genotypes in a
Wild Pig/Large White Intercross

Genotype   Phenotype   Expected Ratio     Observed Ratio          Number
                       (Normal:Mutant)    (Normal:Mutant) ±SE     Tested

I/I        White       1:1                1.15 ±0.075             13
I/I^P      White       3:1                3.11 ±0.084             12
I/i        White       2:1                2.23 ±0.109             14
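
The ratio calculation of section iv and the comparison with the expected ratios can be sketched as follows. The ratio formula is the one stated in the text; the nearest-expected-ratio rule and the names `kit_ratio`, `nearest_genotype` and `EXPECTED_RATIO` are illustrative assumptions, not the procedure used by the authors. The labels I/I, I/i and I/I^P denote the 1:1, 2:1 and 3:1 genotype classes.

```python
# Expected normal:mutant ratios for the three white genotype classes.
EXPECTED_RATIO = {"I/I": 1.0, "I/i": 2.0, "I/I^P": 3.0}


def kit_ratio(area_134bp, area_54bp):
    """Ratio as defined in section iv: area(134 bp) / (2 * area(54 bp))."""
    if area_54bp <= 0:
        raise ValueError("54 bp peak area must be positive")
    return area_134bp / (2.0 * area_54bp)


def nearest_genotype(ratio):
    """Hypothetical helper: genotype whose expected ratio is closest."""
    return min(EXPECTED_RATIO, key=lambda g: abs(EXPECTED_RATIO[g] - ratio))


if __name__ == "__main__":
    # Mean observed ratios reported in Table 4.
    for observed in (1.15, 2.23, 3.11):
        print(observed, nearest_genotype(observed))
```

A nearest-value rule is the simplest possible classifier; a real assay would more likely set cut-offs from the observed spread of each class.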



Figure 5 illustrates that the ranges of ratio values calculated for the
two genotypes I/I and I/I^P do not overlap. This enables animals
carrying the I^P allele to be identified and the frequency of the
allele within different pig breeds determined. Ratio values were
calculated for 56 Landrace and 33 Large White animals and the results
are shown in Figure 6. A clearly bimodal distribution is observed, with
7 Landrace and 3 Large White individuals having a ratio value of
approximately 3 or above, suggesting them to be heterozygous carriers
of the I^P allele (genotype I/I^P). This means I^P has gene frequency
estimates of 6.25% (7/112 chromosomes tested) and 4.5% (3/66
chromosomes tested) within the Landrace and Large White breeds
respectively.
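
The gene frequency estimates quoted above follow from counting one allele copy per heterozygous carrier over two chromosomes per animal. A minimal sketch, assuming every animal flagged by the ratio test is a heterozygous carrier (the function name `allele_frequency` is hypothetical):

```python
def allele_frequency(heterozygous_carriers, animals_tested):
    """Allele frequency when each carrier holds exactly one copy."""
    chromosomes = 2 * animals_tested  # diploid: two chromosomes per animal
    return heterozygous_carriers / chromosomes


if __name__ == "__main__":
    print(f"Landrace:    {allele_frequency(7, 56):.2%}")
    print(f"Large White: {allele_frequency(3, 33):.2%}")
```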

Example 5 

Analysis for presence and quantification of the porcine KIT splice
mutation using the PE ABI TaqMan chemistry

Method

i. Preparation of template DNA for PCR

DNA was prepared as in example 3, section i.

ii. TaqMan® PCR reactions

TaqMan® PCR reactions were set up as shown in Table 5.
TABLE 5

PCR mix for TaqMan® based splice mutation test

Reagent                                          Final Conc.   Volume

10x TaqMan® Buffer A (Perkin Elmer)              1x            2.50 µl
25 mM MgCl2 solution                             5 mM          5.00 µl
dATP                                             200 µM        0.50 µl
dCTP                                             200 µM        0.50 µl
dGTP                                             200 µM        0.50 µl
dUTP                                             200 µM        0.50 µl
Amplitaq Gold™ (5 U/µl) (Perkin Elmer)           0.05 U/µl     0.25 µl
AmpErase™ N-Glycosylase (1 U/µl) (Perkin Elmer)  0.01 U/µl     0.25 µl
KITTM-NEST-F (5 µM)                              500 nM        2.50 µl
KITTM-NEST-R (5 µM)                              500 nM        2.50 µl
KITTM FAM (5 µM)                                 100 nM        0.50 µl
KITTM TET (5 µM)                                 100 nM        0.50 µl
25% Glycerol                                     8%            8.00 µl
Porcine genomic DNA                                            1.00 µl

Total                                                          25.00 µl
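
The final concentrations in Table 5 follow from the dilution relation C_final = C_stock × V_added / V_total with a 25 µl reaction. The sketch below checks a few rows; the genomic DNA volume is read as 1.00 µl, which is an inference from the 25.00 µl total rather than a legible value, and the function name is hypothetical.

```python
TOTAL_VOLUME_UL = 25.0


def final_concentration(stock, volume_ul, total_ul=TOTAL_VOLUME_UL):
    """Dilution formula; the result carries the same unit as `stock`."""
    return stock * volume_ul / total_ul


# Volumes (µl) of the fourteen components listed in Table 5, in order.
# The last entry (genomic DNA, 1.00 µl) is an inferred value.
VOLUMES_UL = [2.5, 5.0, 0.5, 0.5, 0.5, 0.5, 0.25, 0.25,
              2.5, 2.5, 0.5, 0.5, 8.0, 1.0]

if __name__ == "__main__":
    print("MgCl2 (mM):      ", final_concentration(25.0, 5.0))
    print("Primer (µM):     ", final_concentration(5.0, 2.5))
    print("Probe (µM):      ", final_concentration(5.0, 0.5))
    print("Glycerol (%):    ", final_concentration(25.0, 8.0))
    print("Total volume (µl):", sum(VOLUMES_UL))
```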

The PCR primers used were as described below:

KITTM-Nest-F (5'-CTC CTT ACT CAT GGT CGA ATC ACA-3') and

KITTM-Nest-R (5'-CGG CTA AAA TGC ATG GTA TGG-3')

The TaqMan® probes used were:

KITTM-A FAM (5'-TCA AAG GAA ACA TGA GTA CCC ACG CTC-3') and

KITTM-G TET (5'-TCA AAG GAA ACG TGA GTA CCC ACG C-3')
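
The two probes differ at a single base, the first nucleotide of intron 17. The sketch below locates that position and checks whether each probe contains the tetranucleotide CATG, the recognition site of NlaIII. Reading the restriction enzyme used in the digestion-based test as NlaIII is an assumption, since the enzyme name is garbled in the source; the helper `mismatches` is hypothetical.

```python
# Probe sequences as listed above, written without spaces.
PROBE_A_FAM = "TCAAAGGAAACATGAGTACCCACGCTC"  # splice mutant (A)
PROBE_G_TET = "TCAAAGGAAACGTGAGTACCCACGC"    # normal (G)

NLAIII_SITE = "CATG"  # assumed enzyme; its recognition sequence is CATG


def mismatches(seq_a, seq_b):
    """0-based positions where two sequences differ over their shared length."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


if __name__ == "__main__":
    print("Differing positions:", mismatches(PROBE_A_FAM, PROBE_G_TET))
    print("CATG in A probe:", NLAIII_SITE in PROBE_A_FAM)
    print("CATG in G probe:", NLAIII_SITE in PROBE_G_TET)
```

Under that assumption, the G to A substitution creates the CATG site, which is consistent with a shorter fragment marking the splice mutant copy.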

The TaqMan® probes were prepared by Perkin Elmer and labelled with FAM
and TET as indicated, as well as the standard quenching group TAMRA.
The 10x TaqMan® Buffer A, Amplitaq Gold™, AmpErase N-Glycosylase, dNTPs
and 25 mM MgCl2 used were part of the TaqMan® PCR Core Reagent Kit,
supplied by Perkin-Elmer.

20 

The reactions were then placed into a Perkin Elmer ABI Prism 7700
Sequence Detector and the reaction carried out using the following
thermal profile: 50°C for 2 minutes, 95°C for 10 minutes, followed by
40 cycles of 95°C for 15 s and 62°C for 60 s. The reactions were
carried out under the control of 'Sequence Detector V.1.6' software
using the 'Single Reporter' and 'Real-Time' options with the 'Spectral
Compensation' function activated. Upon completion of the run, real-time
profiles for each sample were examined on the ABI 7700 to check for any
samples giving highly irregular profiles, which were then excluded. The
thresholds for both dyes, FAM and TET, were set so that they
intercepted each dye during the exponential phase of PCR. Following
updating of the calculations in 'Sequence Detector V.1.6' software,
results were exported into MS Excel for further analysis.

iii. Analysis of results

Based upon the underlying theoretical principle that one cycle of PCR
gives a doubling in the amount of cleavage of the quenching dye from
the allele specific probe, and therefore doubles the signal, one would
expect the threshold cycle (Ct) numbers from the II and Ii genotypes
analysed to be as below:
Table 6:

Theoretical results for TaqMan® analysis of genotype at the KIT splice
mutation

Genotype   Copies KIT1   Copies KIT2   Theoretical Ct   Theoretical Ct
           (G)           (A)           TET (G)          FAM (A)

II         2             2             X                Y
Ii         2             1             X                Y+1



WO 99/20795 PCT/GB98/03081

In theory the Ct for TET and FAM signals, represented as X and Y,
should be the same, as equal numbers of copies of the target sequences
should be present in an II animal. However, in practice this does not
necessarily occur, due to differences in the hybridization and cleavage
efficiency of the two probes and variation in the setting of the
threshold cycle between the two dye signals. The reduction in splice
mutant containing (A) sequences relative to those not containing the
splice mutation (G) in the Ii animals, i.e. a 2:1 G:A ratio rather than
1:1 as for the II genotype, should lead to the FAM signal reaching the
threshold 1 cycle later than the TET signal in the genotype Ii animals.
The actual results for samples tested are shown in Table 7.
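
The one-cycle delay predicted above follows from the doubling principle: if the signal doubles every cycle, a target present at half the copy number reaches the threshold log2(2) = 1 cycle later. A minimal sketch under the idealised assumption of perfect doubling and equal probe efficiency (the function name is hypothetical):

```python
import math


def expected_ct_delay(copies_reference, copies_target):
    """Cycles by which the target signal lags the reference signal,
    assuming the signal exactly doubles with every PCR cycle."""
    return math.log2(copies_reference / copies_target)


if __name__ == "__main__":
    # Copy numbers per genotype: (G copies, A copies).
    print("II:", expected_ct_delay(2, 2))  # equal copies, no delay
    print("Ii:", expected_ct_delay(2, 1))  # half as many A copies
```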

Table 7

Ct values from analysis of II and Ii genotypes

Sample        Genotype   Ct FAM (A)   Ct TET (G)   Ct FAM - Ct TET

1             Ii         24.68        22.59         2.09
2             Ii         25.98        23.62         2.36
3             Ii         26.54        25.57         0.97
4             Ii         27.37        24.78         2.59
5             Ii         24.94        21.61         3.33
6             Ii         25.68        22.1          3.58
                                      Ii Mean =     2.49

7             II         22.05        23.78        -1.73
8             II         24.22        24.59        -0.37
9             II         24.19        23.85         0.34
10            II         23.66        23.51         0.15
11            II         24.35        22.71         1.64
12            II         22.82        21.69         1.13
13            II         22.84        22.7          0.14
14            II         23.17        22.9          0.27
                                      II Mean =     0.20

No Template              35           35            0
No Template              35           35            0
No Template              35           35            0
No Template              35           35            0
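
The per-genotype means reported in Table 7 can be recomputed from the individual Ct values. The sketch below uses the tabulated values as given; the data layout and the function name `mean_delta_ct` are illustrative choices.

```python
from statistics import mean

# (sample, genotype, Ct FAM (A), Ct TET (G)) as listed in Table 7.
TABLE_7 = [
    (1, "Ii", 24.68, 22.59), (2, "Ii", 25.98, 23.62),
    (3, "Ii", 26.54, 25.57), (4, "Ii", 27.37, 24.78),
    (5, "Ii", 24.94, 21.61), (6, "Ii", 25.68, 22.10),
    (7, "II", 22.05, 23.78), (8, "II", 24.22, 24.59),
    (9, "II", 24.19, 23.85), (10, "II", 23.66, 23.51),
    (11, "II", 24.35, 22.71), (12, "II", 22.82, 21.69),
    (13, "II", 22.84, 22.70), (14, "II", 23.17, 22.90),
]


def mean_delta_ct(genotype, rows=TABLE_7):
    """Mean of (Ct FAM - Ct TET) over all samples of one genotype."""
    return mean(fam - tet for _, g, fam, tet in rows if g == genotype)


if __name__ == "__main__":
    for genotype in ("Ii", "II"):
        print(genotype, round(mean_delta_ct(genotype), 2))
```

The difference between the two means is about 2.3 cycles, larger than the single cycle expected from copy number alone.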

Despite variation around the mean values, it can be seen from Table 7
that there is a significantly increased delay in the FAM signal
reaching the threshold level (approximately 2 cycles) relative to the
TET signal in Ii animals compared to II animals as predicted,
reflecting the reduced number of copies of the splice mutant (A)
sequence present in animals of the Ii genotype. Plotting of the
individual samples on a scatter plot (Figure 7) shows clustering of the
two genotypes, with the Ii cluster shifted along the Ct FAM axis due to
the reduced number of copies of the KIT2 (A) sequence for which the FAM
probe is specific.

CLAIMS:

1. A method for determining coat colour genotype in a pig which
comprises:

(a) obtaining a sample of pig nucleic acid; and

(b) analysing the nucleic acid obtained in (a) to determine whether a
mutation is/is not present at one or more exon/intron splice sites of
the KIT gene.

2. A method as claimed in claim 1 wherein the analysis in step (b) is
carried out to determine whether a mutation is/is not present at the
exon 17/intron 17 boundary.

3. A method as claimed in claim 2 wherein the mutation consists of the
substitution of the G of the conserved GT pair for A.

4. A method as claimed in any one of claims 1 to 3 wherein the sample
of nucleic acid is amplified prior to analysis.

5. A method as claimed in claim 4 wherein the nucleic acid is genomic
DNA.

6. A method as claimed in claim 5 wherein amplification is carried out
using PCR and at least one pair of suitable primers.

7. A method as claimed in claim 6 wherein the pair of suitable primers
is:

5'-GTA TTC ACA GAG ACT TGG CGG C-3'; and
5'-AAA CCT GCA AGG AAA ATC CTT CAC GG-3'.

8. A method as claimed in any one of claims 5 to 7 wherein after
amplification the nucleic acid is treated with a restriction enzyme,
followed by analysis of fragment lengths.

9. A method as claimed in claim 8 wherein the nucleic acid is treated
with the restriction enzyme NlaIII.

10. A method as claimed in claim 8 or claim 9 wherein the ratio of
restriction fragment lengths is determined.

11. A method as claimed in claim 4 wherein the nucleic acid is mRNA.

12. A method as claimed in claim 11 wherein the nucleic acid is
amplified using RT-PCR.

13. A method as claimed in claim 12 wherein the length of RT-PCR
product is determined.

14. A method for determining coat colour genotype in a pig which
comprises the step of analysing a sample of pig KIT protein to
determine whether the protein is the splice variant protein.

15. A kit for use in determining the coat colour genotype of a pig
which comprises one or more reagents suitable for determining whether a
mutation is present at one or more exon/intron splice sites of the KIT
gene.

16. A kit as claimed in claim 15 which comprises one or more reagents
for carrying out PCR and one or more pairs of suitable primers.

17. A kit as claimed in claim 16 which comprises the following pair of
primers:

5'-GTA TTC ACA GAG ACT TGG CGG C-3'; and
5'-AAA CCT GCA AGG AAA ATC CTT CAC GG-3'.
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Sequence Alignment Across the Exon /Intron Border of KIT Exon 17 

Allele   Gene Copy   Sequence
                     Exon 17                     | Intron 17

I        KIT1        AAT TAC GTG GTC AAA GGA AAC | GTG AGT ACC CAC GCT CTC CTG ACA GTC
         KIT2                                    | A
I^P      KIT1                                    | G
         KIT2                                    | G
i        KIT1                                    | G

Ratio of normal to splice mutant KIT

Ratios for Landrace and Large White

Plot of Ct FAM vs Ct TET for TaqMan based PCR analysis of porcine KIT
splice mutation genotype

THERMOSTABLE DNA POLYMERASE AND INTEINS OF THE SPECIES THERMOCOCCUS
FUMICOLANS

The present invention relates to a novel thermostable DNA polymerase
and its two inteins, originating from an archaebacterium of the species
Thermococcus fumicolans.

DNA polymerases are enzymes involved in the replication and repair of
DNA in every living cell. Numerous DNA polymerases isolated from
micro-organisms such as E. coli (DNA polymerase I) or from phage T4 are
known today. DNA polymerases have also been identified and purified
from thermophilic micro-organisms such as Thermus aquaticus (Taq
polymerase, Chien, A. et al. J. Bact. 1976, 127:1550-1557; Kaladin et
al. Biokhymiya 1980, 45:644-651), Thermus thermophilus, and species of
the genera Bacillus (European patent application published under No.
699 760), Thermococcus (European patent application No. 455 430),
Sulfolobus and Pyrococcus (European patent application published under
No. 547 359). Among these DNA polymerases of archaebacterial origin,
mention may be made of Pfu, isolated from Pyrococcus furiosus (18), the
Vent™ polymerase from Thermococcus litoralis (10), 9°N from Pyrococcus
sp. 9°N (15) and DeepVent™ from Pyrococcus GB-D, the first two
originating from coastal strains (Bay of Naples) and the latter two
from deep-sea strains.

The mechanism of action of DNA polymerases is now relatively well
understood and consists of an identical replication of the DNA in a
semi-conservative manner. The copied strand serves as the template and
the four nucleotide triphosphates are the substrate of this
polymerization. Enzymes having DNA polymerase activity are now
increasingly used in vitro in molecular biology for various purposes
such as cloning, error detection, sequencing, labelling and, more
generally, the amplification of nucleic acid sequences.

This in vitro amplification of deoxyribonucleic acid sequences relies
on the polymerase chain reaction (PCR) technique described in European
patents No. 200 362 and 201 184. The principle of this technique is
based on successive cycles of primer extension using the four
nucleotide triphosphates together with a DNA polymerase and a template
DNA to be copied. At each cycle the enzyme doubles the number of
available DNA strands, and between cycles a thermal denaturation is
necessary to open the DNA double helix for the following cycle. The
temperatures used for this thermal denaturation step are not compatible
with retention of the activity of most known DNA polymerases, such as
Klenow. Considerable research effort has therefore been devoted to
finding enzymes that withstand these temperatures.
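
The doubling per cycle described above gives exponential growth in the number of copies. A minimal sketch, in which the `efficiency` parameter (1.0 meaning perfect doubling) is an added assumption to show how real reactions fall short of the ideal, and the function name is hypothetical:

```python
def copies_after(initial_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds of PCR.

    With efficiency 1.0 the count doubles every cycle, i.e. N0 * 2**cycles.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    print(copies_after(1, 10))       # ideal doubling over 10 cycles
    print(copies_after(1, 10, 0.9))  # slightly imperfect amplification
```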
However, although thermophilic micro-organisms are now known, it
remains difficult to obtain these thermostable enzymes in sufficient
production yields. Molecular biology and genetic engineering make it
possible to overcome this drawback. Thus, once located in the genome,
the gene coding for the DNA polymerase is cloned, sequenced and then
recloned into an expression vector in order to produce the protein,
then termed recombinant, in a mesophilic host that is easier to
cultivate, such as E. coli or S. cerevisiae. This method of expression
in E. coli has in particular been described in the international PCT
patent application published under No. WO 89/06691 for producing the
DNA polymerase of Thermus aquaticus.

The DNA polymerase of the invention originates from an archaebacterium
of the species Thermococcus fumicolans. Besides its thermostability,
which makes it particularly effective in, notably, a PCR process, this
DNA polymerase is remarkable in that it has two "protein introns", also
called "inteins", in its precursor polypeptide.

The nucleotide sequence of its inteins is inserted into that of the DNA
polymerase, generally at conserved sites which, after translation, are
involved in the catalytic reactions. These sequences are transcribed
and translated at the same time as that of the DNA polymerase, and the
autocatalytic splicing of the inteins then produces three enzymes: two
inteins and one DNA polymerase.

Such inteins are also found within other molecules, such as the
vacuolar ATPase in S. cerevisiae (4), GyrA in Mycobacterium leprae (7)
and RecA in Mycobacterium tuberculosis (5, 6). Most inteins belong to
the family of "homing endonucleases", since they cut DNA at a
recognized site, at the very place where their nucleotide sequence
becomes inserted.

The development of biotechnology, in research as well as in the fields
of medicine and the agri-food industry, requires the availability of
various types of DNA polymerase capable of improving, quantitatively
and qualitatively, techniques as diverse as cloning, detection and
amplification of DNA sequences. The present invention aims precisely to
provide a novel thermostable DNA polymerase derived from a recently
described species: Thermococcus fumicolans (8). This isolate was
obtained from fragments of chimneys collected in the North Fiji Basin
during the Franco-Japanese STARMER cruise in 1989. This strictly
anaerobic species has an optimum growth temperature of 90°C, which is
relatively high for a Thermococcus. Its optimum pH is 8.8 and its
salinity range is 20 g/l to 40 g/l.

The subject of the invention is therefore a purified thermostable DNA
polymerase of archaebacteria of the species Thermococcus fumicolans
having a molecular weight of the order of 89,000 Da, as well as its
thermostable inteins, the gene of which, comprising the two sequences
coding for said inteins, has been cloned.
The research work that made it possible to identify, sequence and study
the DNA polymerase of the invention was carried out on the strain
Thermococcus fumicolans ST557, deposited at the Collection de
l'Institut Pasteur (CIP) under number CIP 104680. This DNA polymerase
is referred to hereinafter as Tfu. Its sequence of 774 amino acids is
shown in the appended sequence listing under SEQ ID NO: 2. A molecular
weight of 89,797 Da and a pI of 8.1 were deduced from this sequence.

The invention therefore relates to the purified thermostable DNA
polymerase of archaebacteria of the species Thermococcus fumicolans
having a molecular weight of the order of 89,000 daltons, as well as
its enzymatically equivalent derivatives. Enzymatically equivalent
derivatives are understood to mean polypeptides and proteins consisting
of or comprising the amino acid sequence shown in the appended sequence
listing under SEQ ID NO: 2, provided that they exhibit the properties
of the DNA polymerase of Thermococcus fumicolans. In this respect the
invention envisages more particularly a DNA polymerase whose amino acid
sequence is shown in the appended sequence listing under SEQ ID NO: 1,
or a fragment thereof, or an assembly of such fragments, such as the
sequence of 774 amino acids shown in the appended sequence listing
under SEQ ID NO: 2.

Indeed, the presence of the two inteins I-Tfu-1 and I-Tfu-2 in the
sequence SEQ ID NO: 1 is liable to lead, during preparation by chemical
synthesis or by genetic engineering, to truncated T. fumicolans DNA
polymerase sequences whose enzymatic properties are equivalent to those
of the purified T. fumicolans DNA polymerase.

Derivatives are also understood to mean the above amino acid sequences
modified by insertion and/or deletion and/or substitution of one or
more amino acids, provided that the resulting properties of the T.
fumicolans DNA polymerase are not significantly modified.

The invention also relates to a DNA sequence consisting of or
comprising the sequence coding for a DNA polymerase of the invention.

The DNA sequence shown in the appended sequence listing under SEQ ID
NO: 1 is one such sequence. The DNA coding for the T. fumicolans DNA
polymerase and its two inteins consists of nucleotides 357 to 5028.
Nucleotides 1 to 356 correspond to the promoter region of this gene.
Consequently, the subject of the invention is a DNA sequence consisting
of or comprising the sequence between nucleotides 357 and 5028 of SEQ
ID NO: 1, or a fragment thereof, or an assembly of such fragments.

The invention relates most particularly to a DNA sequence consisting of
or comprising nucleotides 357 to 1674, 2755 to 3156 and 4324 to 5028 of
the DNA sequence shown in the appended sequence listing under SEQ ID
NO: 1.

This sequence codes for the T. fumicolans DNA polymerase whose sequence
of 774 amino acids is shown in the appended sequence listing under SEQ
ID NO: 2.

The invention relates equally to the DNA polymerase isolated and
purified from the strain Thermococcus fumicolans and to the DNA
polymerase prepared by chemical synthesis, for example by ligation of
polypeptide fragments, or by genetic engineering methods.



